Characterizing complex quantum systems is a vital task in quantum information science. Quantum
tomography, the standard tool used for this purpose, uses a well-designed measurement record
to reconstruct quantum states and processes. It is, however, notoriously inefficient. Recently, the
classical signal reconstruction technique known as “compressed sensing” has been ported to quantum
information science to overcome this challenge: accurate tomography can be achieved with substantially
fewer measurement settings, thereby greatly enhancing the efficiency of quantum tomography.

Here we show that compressed sensing tomography of quantum systems is essentially guaranteed 
by a special property of quantum mechanics itself: that the mathematical objects that describe the
system in quantum mechanics are matrices with nonnegative eigenvalues. This result has an impact 
on the way quantum tomography is understood and implemented. In particular, it implies that 
the information obtained about a quantum system through compressed sensing methods exhibits a 
new sense of “informational completeness.” This has important consequences for the efficiency of
data taking for quantum tomography, and enables us to construct informationally complete measurements
that are robust to noise and modeling errors. Moreover, our result shows that one can
expand the numerical tool-box used in quantum tomography and employ highly efficient algorithms 
developed to handle large dimensional matrices on a large dimensional Hilbert space. While we 
mainly present our results in the context of quantum tomography, they apply to the general case of 
positive semidefinite matrix recovery. 


INTRODUCTION 

Determining an unknown signal from a set of measurements is a fundamental problem in science and engineering. 
However, as the number of free parameters defining the signal increases, its tomographic determination may become 
a daunting task. Fortunately, in many contexts there is prior information about the signal that may be useful for 
tomography. Compressed sensing is a signal recovery technique developed for this aim. It utilizes specific types
of prior information about the structure of the signal to substantially compress the amount of information needed 
to reconstruct it with high accuracy. In particular, it harnesses the prior information that the signal has a concise 
representation, e.g., that it is a sparse vector with a few nonzero elements or a low-rank matrix with a few nonzero 
singular values. The compressed sensing protocol then defines special classes of measurements, henceforth referred to 
as “compressed sensing measurements,” that enable the unique identification of the signal from within the restricted 
set of sparse vectors or low-rank matrices using substantially fewer measurement settings. Moreover, it provides 
algorithms for efficient reconstruction by defining a specific class of convex optimization heuristics whose solutions
determine the unknown signal from the measurement outcomes with very high accuracy (see Methods). Importantly,
solving an optimization program outside this class will not necessarily result in a compressed sensing protocol.

In the context of quantum information science, the “signals” we seek to reconstruct are, for example, quantum states 
and processes, and the protocol for reconstruction is quantum tomography. Because the number of free parameters
in quantum states and processes scales poorly (growing as some power of the total Hilbert space dimension, which in
turn grows exponentially with the number of subsystems), there has been a concerted effort to develop techniques
that minimize the resources necessary for tomography. To this end, the methodology of compressed sensing has been
applied to the problem of quantum tomography.

In pioneering work on this approach, it was proved that quantum measurements can be easily designed to be within the
special class of measurements required for compressed sensing. Then, using the specifically chosen convex optimization, 
low-rank density matrices (close to pure quantum states) or low-rank process matrices (close to unitary evolutions) 
can be accurately reconstructed with a substantially reduced number of measurement settings. 

The work we report here identifies a critical link between quantum tomography and compressed sensing. We 
discuss in particular the case of quantum state tomography, where the aim is to recover the density matrix, a positive 
semidefinite matrix, typically normalized with unit trace. We show that the positivity property alone imposes a 
powerful constraint that places strong restrictions on the physical states that are consistent with the data. As 
illustrated in Fig. 1, this restriction is stronger than the one present in generic compressed sensing of signals which
are not necessarily positive semidefinite matrices. This, in turn, has far-reaching consequences. First and foremost,
it implies that as long as quantum measurements are within the special class associated with compressed sensing,
then any optimization heuristic that contains the positivity constraint is effectively a compressed sensing protocol.
Second, tools provided by the compressed sensing methodology now enable the construction of special types of 
informationally complete measurements that are robust to noise and to small model imperfections, with rigorous 
bounds. Finally, our results fundamentally unify many different quantum tomography protocols which were previously 
thought to be distinct, such as maximum-likelihood solvers, under the compressed sensing umbrella. We emphasize 
that constraining the normalization (trace) to a fixed value, as one does for density matrices, plays no role in the 
theorems we discuss below. Thus our results extend beyond the context of quantum state tomography, applying, 
e.g., to process tomography when the latter is described by a completely positive map, and more generally to the 
reconstruction of low-rank positive semidefinite matrices. 


RESULTS 

Informational completeness 

In quantum theory, a measurement is represented by a positive operator-valued measure (POVM), a set of positive
semidefinite $d \times d$ matrices that form a resolution of the identity,
$\mathcal{E} = \{E_\mu \mid E_\mu \geq 0,\ \sum_\mu E_\mu = \mathbb{1}\}$. The elements of
a POVM represent the possible outcomes (events) of the measurement, and the probability of measuring an outcome $\mu$
is given by the usual Born rule, $p_\mu = \mathrm{Tr}(E_\mu \rho)$, where $\rho$ is the state of the system, a positive semidefinite matrix,
$\rho \geq 0$, normalized such that $\mathrm{Tr}\rho = 1$. In the context of quantum-state tomography, informationally complete
measurements play a central role. Let $S$ be the set of all quantum states (density matrices). A measurement is said
to be informationally complete if [22]

$$\forall \rho_a, \rho_b \in S,\ \rho_a \neq \rho_b,\ \exists E_\mu \in \mathcal{E}\ \text{s.t.}\ \mathrm{Tr}(E_\mu \rho_a) \neq \mathrm{Tr}(E_\mu \rho_b). \qquad (1)$$

In other words, no two distinct states $\rho_a$ and $\rho_b$ yield the same measurement outcome probabilities. Thus, a
(noise-free) record of an informationally complete measurement uniquely determines the state of the system. In general,
for a $d$-dimensional Hilbert space, an informationally complete measurement consists of at least $d^2$ outcomes (POVM
elements).
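
The Born rule and the definition of informational completeness can be checked by hand on a single qubit. The sketch below is only an illustration, not part of the original analysis: the helper names and the choice of states are assumptions made here. It computes $p_\mu = \mathrm{Tr}(E_\mu \rho)$ for a projective measurement in the $\sigma_z$ eigenbasis and shows that two distinct states produce the same record, so this two-outcome measurement alone is not informationally complete.

```python
# Illustrative single-qubit sketch; all helper names are hypothetical.

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    """Trace of a 2x2 matrix."""
    return a[0][0] + a[1][1]

def born_probabilities(povm, rho):
    """Born rule: p_mu = Tr(E_mu rho) for every POVM element E_mu."""
    return [trace(matmul(e, rho)).real for e in povm]

# Projective measurement in the sigma_z eigenbasis: E_0 = |0><0|, E_1 = |1><1|.
povm_z = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]

# Two distinct states: the pure state |+><+| and the maximally mixed state I/2.
rho_plus = [[0.5, 0.5], [0.5, 0.5]]
rho_mixed = [[0.5, 0.0], [0.0, 0.5]]

p_plus = born_probabilities(povm_z, rho_plus)
p_mixed = born_probabilities(povm_z, rho_mixed)

# Both records coincide, so Eq. (1) fails for this pair of states.
print(p_plus, p_mixed)
```

Adding measurements in the $\sigma_x$ and $\sigma_y$ eigenbases would separate these two states, which is consistent with the $d^2$ counting stated above.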

While Eq. (1) gives a general definition of an informationally complete measurement, if one has prior information
about the state of the system, one can make this definition more specific [23]. In particular, suppose the state
is known a priori to be of a special class, $V$, e.g., the class of density matrices of at most rank $r$. One defines a
measurement to be V restricted informationally complete (restricted-IC) if it can uniquely identify a quantum
state from within the subset $V$, but cannot necessarily uniquely identify it from within the set of all quantum states.
Such V restricted-IC measurements can be composed of fewer outcomes than the $d^2$ outcomes required for a general
informationally complete measurement. For example, Heinosaari et al. [23] showed that when $V$ is the set of density
matrices of at most rank $r$, then rank-$r$ restricted-IC measurements can be constructed with $O(rd)$ outcomes, rather
than the $O(d^2)$ outcomes required for a general informationally complete measurement. One can formalize this definition
in the context of quantum-state tomography. A measurement is said to be V restricted-IC if [23]

$$\forall \rho_a, \rho_b \in V,\ \rho_a \neq \rho_b,\ \exists E_\mu \in \mathcal{E}\ \text{s.t.}\ \mathrm{Tr}(E_\mu \rho_a) \neq \mathrm{Tr}(E_\mu \rho_b). \qquad (2)$$

In some situations, a measurement can satisfy a stricter definition of informational completeness than the V restricted-IC
of Eq. (2). A measurement is said to be V strictly-IC if

$$\forall \rho_a \in V,\ \forall \rho_b \in S,\ \rho_a \neq \rho_b,\ \exists E_\mu \in \mathcal{E}\ \text{s.t.}\ \mathrm{Tr}(E_\mu \rho_a) \neq \mathrm{Tr}(E_\mu \rho_b). \qquad (3)$$

There is a subtle yet important difference in the definitions of V restricted-IC and V strictly-IC. While the measurement
record of the former identifies a unique state within the set V, the measurement record of the latter identifies a unique
state within the set of all quantum states. These notions of informational completeness are key to understanding
compressed sensing and its application in quantum tomography, as we discuss below.

The relation between informational completeness and compressed sensing 

At its heart, the compressed sensing methodology employs prior information to reduce the number of measurements 
required to reconstruct an unknown signal. Here we consider the compressed sensing recovery of a $d \times d$ Hermitian
matrix, $M$. Let the measurement record be specified as a vector-valued linear map, $y_i = \mathcal{A}[M]_i = \mathrm{Tr}(A_i M)$, where
$\mathcal{A}$ is known as the “sensing map.” In general, when the set $\{A_i\}$ forms a basis for $d \times d$ matrices with at least $d^2$
elements [25], then the measurement record is informationally complete in the sense of Eq. (1), and in the absence of
measurement noise, the signal can be recovered uniquely.

If, however, we know a priori that $\mathrm{rank}(M) \leq r$, with $r \ll d$, then we can substantially reduce the number of
measurement samples required to uniquely reconstruct the unknown signal-matrix. This is codified in a theorem by
Recht et al. [8] and Candes et al., which we restate as follows:

Theorem [compressed sensing]. Let the unknown signal $M_0$ be a Hermitian matrix with $\mathrm{rank}(M_0) \leq r$, and
let $y = \mathcal{A}[M_0]$ be the measurement record obtained by a sensing map, $\mathcal{A}$, that corresponds to compressed sensing
measurements for rank $r$. Then $M_0$ is the unique Hermitian matrix within the set of low-rank Hermitian matrices (up
to rank $r$) that is consistent with the measurement record.

Importantly, in compressed sensing, when $r \ll d$, there are generally an infinite number of Hermitian matrices with
rank larger than $r$ that are consistent with the measurement record. Thus, the measurement record associated
with compressed sensing cannot uniquely specify $M_0$ among all $d \times d$ Hermitian matrices, and therefore it is not
informationally complete in the sense of Eq. (1). If, however, the sensing map $\mathcal{A}$ corresponds to compressed sensing
measurements (e.g., it satisfies the restricted isometry property [3], see Methods), then according to the above theorem,
the measurement record uniquely specifies $M_0$ within the restricted set of low-rank Hermitian matrices
($\mathrm{rank}(M) \leq r \ll d$). Therefore compressed sensing measurements correspond to rank-$r$ restricted-IC, in the sense of Eq. (2).

This relation between compressed sensing measurements and rank-$r$ restricted-IC implies that any successful search
must be restricted to the low-rank set of Hermitian matrices. To achieve this, one solves the convex optimization
problem

$$\hat{M} = \arg\min_{M} \|M\|_*\ \text{s.t.}\ y = \mathcal{A}[M], \qquad (4)$$

where $\|M\|_* = \mathrm{Tr}\sqrt{M^\dagger M}$ is the nuclear (or trace) norm, which serves as the convex proxy for rank minimization.
Under the conditions above, the optimal solution is $\hat{M} = M_0$, i.e., exact recovery. The use of the nuclear norm is
essential here. If one uses only the compressed number of samples, solving any other optimization that is not related
to the above rank-minimization heuristic by some regularization will not result in a successful recovery. For example,
the convex programs $\arg\min_M \mathrm{Tr}(M)$ s.t. $y = \mathcal{A}[M]$, and $\arg\min_M \|y - \mathcal{A}[M]\|_2$, with $m \ll d^2$ samples
$\{y_i\}$, will generally yield a solution that is very different from $M_0$. Such estimators generally require $m \sim d^2$ samples
to recover $M_0$. The analogous result holds for compressed sensing of sparse vectors. There one requires minimization
of the $\ell_1$-norm of the vector, a convex heuristic for vector sparsity.
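
To make the distinction between the trace and the nuclear norm concrete, the following sketch compares the two on $2 \times 2$ Hermitian matrices. It is a toy check under assumptions chosen here (the helper names and the two test matrices are not from the original text). For a Hermitian matrix the nuclear norm is the sum of the absolute values of the eigenvalues, so it coincides with the trace exactly when no eigenvalue is negative.

```python
import math

def eigenvalues_2x2_hermitian(a, c, b):
    """Eigenvalues of the Hermitian matrix [[a, c], [conj(c), b]], a and b real."""
    mean = (a + b) / 2
    radius = math.sqrt(((a - b) / 2) ** 2 + abs(c) ** 2)
    return mean - radius, mean + radius

def nuclear_norm(a, c, b):
    """Nuclear norm of a Hermitian matrix: sum of absolute eigenvalues."""
    return sum(abs(lam) for lam in eigenvalues_2x2_hermitian(a, c, b))

def trace(a, c, b):
    """Trace of [[a, c], [conj(c), b]]."""
    return a + b

psd = (0.5, 0.5, 0.5)          # |+><+|, eigenvalues 0 and 1
indefinite = (1.0, 0.0, -1.0)  # sigma_z, eigenvalues -1 and +1

for name, m in (("psd", psd), ("indefinite", indefinite)):
    print(name, "trace =", trace(*m), "nuclear norm =", nuclear_norm(*m))
```

For the indefinite example the trace vanishes while the nuclear norm equals 2, which illustrates why minimizing the trace over all Hermitian matrices is not a substitute for minimizing the nuclear norm.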

In what follows, we specialize the compressed sensing paradigm to the case of positive matrix recovery, and in particular
to quantum-state tomography. There, the aim is to recover the state of the system, $\rho$, which has the key property of
positivity, $\rho \geq 0$.


The role of positivity in compressed sensing quantum tomography 

Our central result is summarized in the following theorem: 

Theorem 1. Let $P_0$ be a positive semidefinite matrix with $\mathrm{rank}(P_0) \leq r$, and let $y = \mathcal{A}[P_0]$ be the measurement
record obtained by a sensing map $\mathcal{A}$ that corresponds to compressed sensing measurements for a rank-$r$ Hermitian matrix.
Then $P_0$ is the unique matrix within the set of positive semidefinite matrices of any rank that is consistent with the
measurement record.

This theorem is analogous to the one presented by Bruckstein et al. for the case of positive sparse vector
solutions of an underdetermined set of linear equations. Its proof, as well as the details concerning the requirements
on the sensing map, are given in Supplementary information Section A. It also extends a result by Candes et
al. and Demanet and Hand [28] from rank-1 matrices to matrices with rank $\leq r$ for all permissible $r$.

Theorem 1 differs qualitatively from the standard compressed sensing theorem in a few key aspects. As discussed
above, the general theory of compressed sensing guarantees that if the signal is a low-rank matrix with rank $\leq r$, and if
the sensing map corresponds to compressed sensing measurements, then the measurement record uniquely specifies the
unknown signal-matrix, but only within the subset of matrices with rank $\leq r$. Theorem 1, on the other hand, states
that if the matrix to be estimated is constrained to be a positive matrix (e.g., a density matrix), then the measurement
record uniquely specifies the matrix from within the entire set of positive Hermitian matrices. Therefore, without the
positivity constraint, compressed sensing measurements correspond to the rank-$r$ restricted-IC measurements of Eq. (2),
whereas under positivity, the same measurements correspond to the rank-$r$ strictly-IC measurements of Eq. (3). This
central result of Theorem 1 is illustrated in Fig. 1.

FIG. 1. Schematic illustration of Theorem 1. (a) A generic compressed sensing scenario. The noiseless measurement
record uniquely specifies the low-rank signal matrix $M_0$ (represented by a red dot) within the set of Hermitian matrices with
rank $\leq r$ (the yellow non-convex set). However, there are many other Hermitian matrices with rank larger than $r$ that are
consistent with the measurement record (shown as the blue dots). (b) A generic scenario of compressed sensing of
quantum states. If the noiseless measurement record comes from a density matrix, i.e., a positive matrix $\rho_0 \geq 0$ whose rank
is $\leq r$, then, according to Theorem 1, it specifies $\rho_0$ uniquely among the set of all positive matrices (shown as the red convex
set). All other matrices that are consistent with the measurement record necessarily have negative eigenvalues and their rank
is strictly larger than $r$.


The implication of Theorem 1 for quantum-state tomography is as follows. Suppose that the state of the system
$\rho_0$, a positive semidefinite matrix, has rank $\leq r$. Assume that we have measured the system with a sensing map that
satisfies the appropriate compressed sensing property, and obtained the (noiseless) measurement record $\mathcal{A}[\rho_0] = p$.
Then, according to Theorem 1, $\rho_0$ is the only density matrix within the set of positive Hermitian matrices of any
rank that yields the measurement probabilities $p$. Geometrically, as observed previously, Theorem 1 states that the
rank-deficient subset of the positive matrices cone is “pointed.” Therefore, under the promise that $\mathrm{rank}(\rho_0) \leq r$ and
$\mathcal{A}$ corresponds to compressed sensing measurements, the space of matrices $\rho$ that satisfy $\mathcal{A}[\rho] = p$ and the cone of
positive matrices intersect in a single point, $\rho = \rho_0$.
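
The “pointed” geometry can be seen in a toy single-qubit calculation. This is a sketch under assumptions made here, not the construction used in the theorem: the record below (the trace and the $\sigma_z$ expectation value) is not claimed to satisfy the restricted isometry property, and it pins down only this particular state. For the true state $|0\rangle\langle 0|$, every Hermitian matrix consistent with $\mathrm{Tr}(M) = 1$ and $\mathrm{Tr}(\sigma_z M) = 1$ has the form $[[1, c], [c^*, 0]]$, and the sketch checks that any nonzero $c$ forces a negative eigenvalue.

```python
import math

def eigenvalues_2x2_hermitian(a, c, b):
    """Eigenvalues of the Hermitian matrix [[a, c], [conj(c), b]], a and b real."""
    mean = (a + b) / 2
    radius = math.sqrt(((a - b) / 2) ** 2 + abs(c) ** 2)
    return mean - radius, mean + radius

# Hypothetical off-diagonal values c; c = 0 is the true state |0><0|.
candidates = [0.0, 0.1, 0.3j, -0.5 + 0.2j]

min_eigs = []
for c in candidates:
    lowest, highest = eigenvalues_2x2_hermitian(1.0, c, 0.0)
    min_eigs.append(lowest)
    print("c =", c, "eigenvalues =", (lowest, highest))
```

Only the candidate with $c = 0$ is positive semidefinite; all the others are full rank with one negative eigenvalue, in line with the description in the caption of Fig. 1(b).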

Theorem 1 implies that the solution set contains only one matrix, the density matrix $\rho_0$. It follows that we can use
any optimization method to search for it, and we are guaranteed to find it. Thus, we have the following result: Given
a quantum measurement record $p = \mathcal{A}[\rho_0]$, such that $\mathrm{rank}(\rho_0) \leq r$, and where $\mathcal{A}$ corresponds to compressed sensing
measurements, the solution to

$$\hat{\rho} = \arg\min_{\rho} C(\rho)\ \text{s.t.}\ \mathcal{A}[\rho] = p\ \text{and}\ \rho \geq 0, \qquad (5)$$

or to

$$\hat{\rho} = \arg\min_{\rho} \|\mathcal{A}[\rho] - p\|\ \text{s.t.}\ \rho \geq 0, \qquad (6)$$

where $C(\rho)$ is any convex function of $\rho$, and $\|\cdot\|$ is any norm function, is unique: $\hat{\rho} = \rho_0$. By confining the feasible
set of matrices to positive matrices, we ensure that the measurement record uniquely identifies $\rho_0$ from the set of
all density matrices, and thus any convex function of $\rho$ or the measurement error may serve as a cost function. For
example, this result applies to maximum-(log)likelihood estimation [29], where $C(\rho) = -\log \prod_\mu \mathrm{Tr}(E_\mu \rho)^{p_\mu}$. We thus
conclude that when the feasible set of density matrices is constrained to be physical (i.e., have nonnegative eigenvalues),
any quantum tomography protocol whose sensing map corresponds to compressed sensing measurements will exhibit
the compressed sensing effect. We do not include a trace constraint in the convex programs above; in the noiseless
case considered here it is redundant. Because the data came from a trace-preserving quantum measurement, the
unique solution must be a normalized quantum state. As discussed in the Supplementary information, the constraints
$\rho \geq 0$ and $\mathrm{Tr}\rho = 1$, taken together, immediately imply that $\rho_0$ is the only density matrix consistent with the noiseless
data. When we consider the important case of noisy measurements, the consequence of the trace constraint is nontrivial, as
we discuss in the next section.

Robustness to measurement noise and model imperfection

So far, we have discussed the ideal case of a noiseless measurement record, where in the context of quantum tomography,
$p$ denoted a probability vector. The compressed sensing methodology, however, assures a robust reconstruction
of the signal in the presence of measurement noise. Our analysis inherits this crucial feature. In a realistic scenario,
we allow for a noisy measurement record, $f = \mathcal{A}[\rho_0] + e$, where we assume that the noise contribution can be bounded
in some norm, $\|e\| \leq \epsilon$. In the context of quantum tomography we consider $f$ to denote a vector of the observed
frequencies of measurement outcomes.

Theorem 1 ensures robust recovery of the positive density matrix if the noise level is small by solving any convex
optimization problem. Under the assumptions of Theorem 1, any convex minimization problem that searches for
a solution within the cone of positive matrices must yield a solution $\hat{\rho}$ such that $\|\hat{\rho} - \rho_0\| \leq g(\epsilon)$, where $g(\epsilon) \to 0$
as $\epsilon \to 0$. From a geometrical point of view, when the noisy data arises from a rank-deficient state, as we gain
data, there are fewer states that could have given rise to that data because the convex set of physical states is
highly constrained near that point. In the idealized limit of noiseless data, there is only one state compatible with
the data. Therefore, qualitatively, we expect a compressed sensing effect no matter how we search for the solution
whenever the data arises from low-rank positive matrices. Quantitatively, of course, different heuristics may perform
differently, yielding different estimates. Choosing the best optimization depends, in part, on the specific noise model.
For example, in Supplementary information Section B we derive a specific bound on the Frobenius (Hilbert-Schmidt)
norm $\|\hat{\rho} - \rho_0\|_F = \sqrt{\mathrm{Tr}(\hat{\rho} - \rho_0)^2}$, where $\hat{\rho}$ is the solution of a nonnegative least-squares program

$$\hat{\rho} = \arg\min_{\rho} \|\mathcal{A}[\rho] - f\|_2\ \text{s.t.}\ \rho \geq 0. \qquad (7)$$

Whereas the normalization constraint, $\mathrm{Tr}\rho = 1$, was unnecessary in the noiseless case, in the case of a noisy measurement
record, the convex optimization is not guaranteed to obtain a normalized state. One can include the trace constraint
in the optimization, but it is generally unnecessary in the noisy case as well. In fact, sometimes one can actually
improve the robustness to noise by choosing $\mathrm{Tr}\rho \neq 1$, as we discuss below. In general, the output of the optimization
should then be renormalized to give the final estimate.
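
A minimal single-qubit sketch of the nonnegative least-squares program of Eq. (7), followed by renormalization, is given below. Everything in it is an assumption made for illustration: the sensing map returns the trace and the three Pauli expectation values rather than outcome frequencies, the noisy record is invented, and the solver is a plain projected-gradient iteration rather than the algorithm used for the simulations in this paper. The matrix is written as $\rho = (t\,\mathbb{1} + \vec{r}\cdot\vec{\sigma})/2$, for which $\rho \geq 0$ is equivalent to $t \geq |\vec{r}|$.

```python
import math

def project_psd(v):
    """Frobenius-norm projection of (t, x, y, z), the coordinates of
    (t*I + r.sigma)/2, onto the positive cone t >= |r|."""
    t, r = v[0], v[1:]
    norm_r = math.sqrt(sum(c * c for c in r))
    if norm_r <= t:            # both eigenvalues already nonnegative
        return list(v)
    if norm_r <= -t:           # both eigenvalues negative: project to zero
        return [0.0, 0.0, 0.0, 0.0]
    lam = (t + norm_r) / 2     # keep the positive eigenvalue, clip the other
    return [lam] + [lam * c / norm_r for c in r]

def nnls_projected_gradient(f, steps=50, eta=0.25):
    """Minimize ||A[rho] - f||_2^2 over rho >= 0, where A[rho] is assumed to be
    (Tr rho, Tr(sigma_x rho), Tr(sigma_y rho), Tr(sigma_z rho)) = (t, x, y, z)."""
    v = [1.0, 0.0, 0.0, 0.0]   # start from the maximally mixed state I/2
    for _ in range(steps):
        grad = [2 * (vi - fi) for vi, fi in zip(v, f)]
        v = project_psd([vi - eta * gi for vi, gi in zip(v, grad)])
    return v

# Hypothetical noisy record for |0><0| (ideal record: 1, 0, 0, 1).
# It is unphysical as it stands, because |r| > t.
f_noisy = [1.0, 0.05, -0.03, 1.08]

v_hat = nnls_projected_gradient(f_noisy)
rho_bloch = [c / v_hat[0] for c in v_hat]   # renormalize so that Tr rho = 1
print("unnormalized estimate:", v_hat)
print("renormalized estimate:", rho_bloch)
```

In this example the unconstrained least-squares solution would simply return the unphysical record itself; the projection onto the positive cone yields a rank-1 estimate whose trace differs slightly from 1, and the final division restores the normalization.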

We see this explicitly in the work of Gross et al., who obtained a compressed sensing version of quantum-state tomography
by solving the minimization problem,

$$\min_{\rho} \mathrm{Tr}\rho\ \text{s.t.}\ \|f - \mathcal{A}[\rho]\|_2 \leq \epsilon,\ \rho \geq 0. \qquad (8)$$

This is equivalent to minimizing the nuclear norm of $\rho$ under the same constraints, i.e., when the feasible set is $\rho \geq 0$,
then $\|\rho\|_* = \mathrm{Tr}\rho$. As noted above, minimizing the trace of the matrix in the absence of the positivity constraint is not
equivalent to minimizing the nuclear norm, and therefore would not achieve compressed sensing. While both Eq. (7)
and Eq. (8) are compressed sensing programs, in general they return different estimates. However, in Supplementary
information Section C we show that the nonnegative least-squares program,

$$\min_{\rho} \|\mathcal{A}[\rho] - f\|_2\ \text{s.t.}\ \mathrm{Tr}\rho = t,\ \rho \geq 0 \qquad (9)$$

is exactly equivalent to the nuclear-norm minimization of Eq. (8) for a particular choice of $t$. This fact was observed
empirically in a recent experiment by Smith et al., in which quantum-state tomography via continuous
measurement was achieved at an equivalent rate by both least-squares and trace minimization, with the positivity
constraint included. The difference between the final estimates was attributed to a difference in the robustness of the
two estimators to noise. Since Eqs. (8) and (9) are formally equivalent, the noisy measurement can be equivalently
accommodated by solving Eq. (9) with a choice of $t$ that depends on the noise bound $\epsilon$. As always, we renormalize to
obtain the final density matrix.

In addition to noise in the measurements, there can be imperfections in the model. When the sensing map satisfies 
the restricted isometry property, the compressed sensing methodology is not restricted to exact rank-deficient signal 
matrices. It also ensures the robust recovery of the dominant rank-r part of the density matrix. Our analysis shares 
this important and nontrivial property. Lemma 2 given in Supplementary information Section B is the root of this 
feature. 

We have shown that Theorem 1 implies that for positive matrix recovery, compressed sensing measurements
correspond to a stronger notion of informational completeness, namely rank-$r$ strictly-IC. This implies that for quantum
tomography we can construct robust measurements that are also rank-$r$ strictly-IC. The robustness to measurement
noise and model imperfection is guaranteed by the compressed sensing theory. For example, in the context of a
many-qubit system, Liu showed that $O(rd\,\mathrm{poly}(\log d))$ expectation values of Pauli products, $w = \bigotimes_{i=1}^{n} \sigma_{\alpha_i}$, where
$\sigma_{\alpha_i} \in \{I, \sigma_x, \sigma_y, \sigma_z\}$, satisfy the restricted isometry property with overwhelming probability. Therefore, this set of
expectation values is, with high probability, a robust rank-$r$ strictly-IC measurement record. Similar results hold for
sparse quantum process matrix reconstruction, e.g., it has been shown that if the sensing map is constructed from
random input states and random observables, then the restricted isometry property holds with high probability.

Numerical test: Compressed sensing state tomography of n-qubit system. 

Gross and coworkers studied the problem of quantum-state tomography of an n-qubit system and showed
that $m = O(rd\,\mathrm{poly}(\log d))$ expectation values of random Pauli observables satisfy an appropriate restricted isometry
property with high probability. If these expectation values are obtained through measurements in Pauli bases, i.e.,
local projective measurements on individual qubits in the eigenbasis of the Pauli observables, then, in fact, we obtain
much more information. In addition to the average values, we also obtain the frequency of occurrence of each outcome,
$E_\mu = \bigotimes_{i=1}^{n} P_{\alpha_i}$, where $\mu$ indexes the series of $\alpha_i$, $\alpha = x, y, z$, and
$P_{\alpha_i} \in \{|\uparrow_{\alpha_i}\rangle\langle\uparrow_{\alpha_i}|, |\downarrow_{\alpha_i}\rangle\langle\downarrow_{\alpha_i}|\}$. Thus, we expect that
we can obtain the required information for high-fidelity reconstruction using substantially fewer measurements based
on individual outcomes in random Pauli bases rather than expectation values, and further reduce the resources needed
for quantum-state tomography of a collection of qubits.

To exemplify this and the implication of Theorem 1, we perform numerical experiments on an n-qubit system (see
Methods for details). In Fig. 2 we simulate measurements on a three-qubit system, $d = 8$, and compare different
numerical programs to estimate the state. In Fig. 2a we solve three estimators: Eq. (4) (nuclear-norm minimization),
$\min_M \|p - \mathcal{A}[M]\|_2$ (least-squares), and $\min_M \mathrm{Tr}(M)$ s.t. $p = \mathcal{A}[M]$ (trace minimization). Note that none of these
estimators constrain the feasible set to the cone of positive matrices. The least-squares and trace minimization are
not convex heuristics for rank minimization, and thus, as expected, they do not achieve compressed sensing. These
programs require a full informationally complete measurement record in order to reconstruct the quantum state. On
the other hand, the nuclear-norm heuristic does exhibit the compressed sensing effect, and recovers the density matrix
with far fewer measurement outcomes. In Fig. 2b we use the same data as in Fig. 2a, but here we use estimators that
restrict the feasible set to positive semidefinite matrices, e.g., the nonnegative least-squares estimator, Eq. (7). The
plots clearly show the implication of Theorem 1. Once restricted to the positive cone, the performance of all of the
estimators is qualitatively the same and they all exhibit the compressed sensing effect. When the number of Pauli
bases is sufficient for the sensing map to satisfy the appropriate restricted isometry property, the various estimators
find the exact state in the idealized situation where the measurement record has no noise, and they find a good
estimate that is close to the true state of the system in the presence of noise due to finite sampling statistics of
200 repetitions.

In Fig. 3 we treat a large-dimensional Hilbert space: a ten-qubit system, $d = 2^{10} = 1024$. We simulate 30 random
Pauli bases of a Haar-random pure state with $N_{\mathrm{rep}} = 100d$ repetitions for each observable. We estimate the state by
solving Eq. (7) with a convex optimization program that can efficiently handle such large-dimensional data sets [30].
The program implements a standard algorithm that uses gradient methods together with projection onto the positive
cone. In the plot we see the compressed sensing effect due to the positivity constraint: all the information is captured
in about 28 random Pauli bases, given sufficient statistics.
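
The savings quoted for the three-qubit and ten-qubit simulations can be put in perspective with simple counting. The sketch below is an illustrative tally only; the function name and dictionary keys are chosen here. It uses the facts that an n-qubit system has $d = 2^n$, that a normalized density matrix has $d^2 - 1$ real parameters, that there are $3^n$ local Pauli bases in total, and that each basis has $2^n$ outcomes.

```python
def pauli_counts(n_qubits, n_bases_used):
    """Basic resource counts for local Pauli-basis measurements on n qubits."""
    d = 2 ** n_qubits
    return {
        "d": d,
        "free_parameters": d * d - 1,        # real parameters of a density matrix
        "all_pauli_bases": 3 ** n_qubits,    # x, y or z chosen on every qubit
        "outcomes_per_basis": d,             # 2 outcomes per qubit
        "outcomes_recorded": n_bases_used * d,
    }

three_qubits = pauli_counts(3, 10)   # m = 10 bases, as in the Fig. 2 caption
ten_qubits = pauli_counts(10, 30)    # 30 random bases, as in the text above

print(three_qubits)
print(ten_qubits)
```

For three qubits the full set consists of 27 Pauli bases, matching the number quoted for the estimators without positivity, while for ten qubits 30 bases are a small fraction of the 59049 available.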


DISCUSSION 

We have established a rigorous connection between the positivity property of quantum states and the compressed 
sensing method for quantum tomography. Thanks to the positivity constraint associated with physical states, the 
record of such measurements allows for a unique identification of a low rank quantum state within the set of all physical 
quantum states, of any rank. Thus, the measurements used for compressed sensing are informationally complete in a 
strict sense (strictly-IC). This aspect is fundamentally different from what happens if positivity is not included. In the
absence of the positivity constraint, the compressed sensing measurements are informationally complete in a restricted 
sense since they only uniquely identify a signal matrix from within the set of low rank matrices (restricted-IC). 

This strict relation has theoretical and practical implications. Most importantly, it implies that if one employs 
an optimization program that searches for a physical (positive) quantum state, any quantum tomography procedure 
whose sensing map corresponds to compressed sensing measurements will exhibit the compressed sensing effect. This 
unifies apparently distinct numerical procedures such as maximum-likelihood and nuclear-norm minimization under 
the umbrella of compressed sensing. From a practical perspective, when the positivity constraint is included, one 
can achieve compressed sensing estimation with any efficient convex optimization, such as ADMM algorithms [55] 
developed to handle large dimensional matrices. 

FIG. 2. Comparison of different estimators with and without the positivity constraint. We simulate a three-qubit
system in which we produce a pure state, $|\psi_0\rangle$, and perform random projective measurements in the Pauli basis (all
simulations are averaged over 10 Haar-random states). (a) Estimation without the positivity constraint. We consider
here the ideal case of a noiseless measurement record and plot the Frobenius distance between the state $|\psi_0\rangle$ and the solution
of an estimator, $\hat{\rho}$, $\sqrt{\mathrm{Tr}(\hat{\rho} - |\psi_0\rangle\langle\psi_0|)^2}$. The estimations are obtained by solving three different convex optimizations: (i)
Nuclear-norm minimization: Eq. (4), (ii) Least-squares minimization: $\hat{\rho} = \arg\min_M \|p - \mathcal{A}[M]\|_2$, and (iii) Trace minimization:
$\hat{\rho} = \arg\min_M \mathrm{Tr}(M)$ s.t. $p = \mathcal{A}[M]$. The figure clearly shows that only nuclear-norm minimization achieves compressed
sensing, i.e., exact recovery of the density matrix with a small number of measurement bases (here $m = 10$). Least-squares and
trace minimization require a full informationally complete measurement record with 27 Pauli bases to achieve exact recovery.
(b) Estimation with the positivity constraint. We plot here the infidelity between $|\psi_0\rangle$ and the solution of different
estimators, $1 - \langle\psi_0|\hat{\rho}|\psi_0\rangle$. The estimations are obtained by solving three different convex optimizations where the feasible set
is constrained to the cone of positive matrices: (i) Nonnegative trace minimization (equivalently nuclear-norm minimization),
Eq. (8), (ii) Nonnegative least-squares minimization, Eq. (7), and (iii) The maximum-(log)likelihood estimator based on the
algorithm described in [32]. In the main plot we simulate the case of an ideal noiseless measurement record; in the inset plot
we simulate a statistically noisy measurement record that corresponds to the frequency of outcomes for $N_{\mathrm{rep}} = 200$ repetitions.
This figure exemplifies that when restricted to the set of positive matrices, all estimators are compressed sensing estimators.

FIG. 3. Ten-qubit state tomography. We simulate data based on random Pauli-projective measurements (see text). The
quantum tomography employs nonnegative least-squares, according to Eq. (7). This algorithm can efficiently handle
large-dimensional matrices [30]. We show the infidelity as a function of the number of measurement settings averaged over 10
Haar-random pure states (error bars shown). The simulation clearly exhibits the compressed sensing effect.

Since the compressed sensing measurements that satisfy the restricted isometry property are robust to measurement
noise and model imperfection, this allows us to construct strictly-IC measurements that are robust against such noise.
That is, if there is measurement noise and/or we strictly violate the assumption that $\mathrm{rank}(\rho) \leq r$ (but only require
that the density matrix is close to a density matrix with rank $\leq r$), then we are guaranteed that the estimate will
be close to the unknown matrix.

Finally, though we have presented our results in the context of quantum-state tomography, they are general and 
apply to the case of positive sparse vectors and positive rank-deficient matrices, the latter exemplified by 
quantum-process tomography. 


METHODS 

Compressed sensing measurements in matrix reconstruction. A sensing map for matrix reconstruction, $\mathcal{A}$, 
is defined as a vector-valued linear map on a $d \times d$ Hermitian matrix, $y_i = \mathcal{A}[M]_i$. This yields “compressed sensing 
measurements for rank-r” if it guarantees a robust recovery of matrices with rank $\le r$ by solving a nuclear-norm 
minimization program, e.g., the compressed sensing heuristic, 

$$\hat{M} = \arg\min_{M} \|M\|_* \quad \text{s.t.} \quad \|\mathcal{A}[M] - f\|_2 \le \epsilon, \tag{10}$$

where $f$ is the noisy measurement record, $f = y + e$. When the matrix is promised to have rank $r$, the number of 
sufficient samples is of order $O(rd)$, with possible logarithmic corrections, and the distance between the reconstruction 
$\hat{M}$ and the true matrix $M_0$ is $O(\epsilon)$, where $\|e\|_2 \le \epsilon$. In this sense, the reconstruction is “robust,” and it is compressed sensing when $r \ll d$. 
An analogous definition holds in the case of sensing maps for sparse vector reconstruction. 

A sufficient condition for a sensing map to yield compressed sensing measurements for matrix reconstruction is that it 
satisfies the “restricted isometry property.” The map satisfies the restricted isometry property for rank-r if there is 
some constant $0 \le \delta_r < 1$ such that 

$$(1-\delta_r)\,\|M\|_F^2 \;\le\; \|\mathcal{A}[M]\|_2^2 \;\le\; (1+\delta_r)\,\|M\|_F^2 \tag{11}$$

holds for all Hermitian matrices $M$ with rank $\le r$, where $\|M\|_F = \sqrt{\mathrm{Tr}(M^\dagger M)}$. The smallest constant $\delta_r$ for which 
this property holds is called the restricted isometry constant. 
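
As a rough numerical illustration of the restricted isometry inequality, the following sketch evaluates the ratio $\|\mathcal{A}[M]\|_2^2 / \|M\|_F^2$ on randomly drawn rank-r Hermitian matrices. It assumes a random Gaussian sensing map, chosen only for convenience and not the Pauli-based map used in this work. The function names are hypothetical. Random sampling probes typical low-rank matrices only, so the result is an empirical lower bound on the isometry constant rather than a certificate of it.

```python
import numpy as np

def random_low_rank_hermitian(d, r, rng):
    """Random d x d Hermitian matrix of rank r (positive semidefinite by construction)."""
    X = rng.normal(size=(d, r)) + 1j * rng.normal(size=(d, r))
    return X @ X.conj().T

def gaussian_sensing_map(d, m, rng):
    """m random Hermitian matrices H_i, scaled so that E[||A[M]||_2^2] = ||M||_F^2."""
    G = rng.normal(size=(m, d, d)) + 1j * rng.normal(size=(m, d, d))
    H = (G + G.conj().transpose(0, 2, 1)) / 2
    return H / np.sqrt(m)

def apply_map(H, M):
    """A[M]_i = Tr(H_i M); the result is real for Hermitian H_i and M."""
    return np.einsum("ijk,kj->i", H, M).real

def rip_ratios(d, r, m, n_trials, seed=0):
    """Ratios ||A[M]||_2^2 / ||M||_F^2 over random rank-r test matrices."""
    rng = np.random.default_rng(seed)
    H = gaussian_sensing_map(d, m, rng)
    ratios = []
    for _ in range(n_trials):
        M = random_low_rank_hermitian(d, r, rng)
        ratios.append(np.sum(apply_map(H, M) ** 2) / np.linalg.norm(M, "fro") ** 2)
    return np.array(ratios)

# m = 200 is fewer than the d^2 = 256 real parameters of a 16 x 16 Hermitian matrix.
ratios = rip_ratios(d=16, r=2, m=200, n_trials=50)
delta_estimate = np.max(np.abs(ratios - 1.0))  # empirical lower bound on delta_r
```

Ratios concentrated near 1 indicate that the map acts approximately as an isometry on the sampled low-rank matrices, which is the behavior inequality (11) demands uniformly over all matrices of rank at most r.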

With a small isometry constant $\delta_r$, the sensing map $\mathcal{A}$ acts almost like an isometry when applied to matrices of rank $\le r$, 
and thus allows us to effectively invert the measurement data to determine the matrix. Depending on the context, 
there are various results in the compressed sensing literature that apply for different values of the isometry constant. 
For example, Candès and collaborators [5] show that the compressed sensing theory applies when $\delta_{4r} < \sqrt{2} - 1$ 
(see Supplementary Information Section B). 

Our results are general and apply whenever the sensing map corresponds to compressed sensing measurements that 
assure robust recovery through the solution of Eq. (10). While the restricted isometry property is sufficient, our 
results are applicable in other cases, such as those described in m where a robust recovery is guaranteed by $O(rd)$ 
generic rank-one projections, or by $O(rd\log d)$ projectors onto random elements of an approximate 4-design. 


Numerical experiments. In our numerical experiments, we simulate independent measurements of random Pauli 
bases on a Haar-random pure state of dimension $d = 2^n$, $\rho_0 = |\psi_0\rangle\langle\psi_0|$. The measurement record, given by the 
frequency of outcomes, $f$, is generated by sampling $N_{\rm rep}$ times from the probability distribution $\mathbf{p} = \mathrm{Tr}(E\rho_0)$. Here 
$E$ is the vector of POVM elements, each corresponding to a tensor product of projectors onto the eigenbasis of Pauli 
observables, $E_\mu = \bigotimes_{i=1}^{n} P_{\alpha_i}$, where $\mu$ indexes the series of $\alpha_i$, $\alpha = x, y, z$, and $P_{\alpha_i} \in \{|\uparrow_{\alpha_i}\rangle\langle\uparrow_{\alpha_i}|,\, |\downarrow_{\alpha_i}\rangle\langle\downarrow_{\alpha_i}|\}$. The 
measurement record is then used in various estimators m. We measure the performance by the average infidelity 
over 10 random pure states, $1 - \langle\psi_0|\hat{\rho}|\psi_0\rangle$. 
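
The data-generation step described above can be sketched as follows. This is a minimal illustration, not the code used for the reported simulations: the function names are hypothetical, each measurement setting is drawn by picking one Pauli axis per qubit uniformly at random, and the estimators themselves are not included.

```python
import numpy as np

# Columns are the (up, down) eigenvectors of the corresponding Pauli observable.
S = 1 / np.sqrt(2)
PAULI_EIGENBASES = {
    "x": np.array([[S, S], [S, -S]], dtype=complex),
    "y": np.array([[S, S], [1j * S, -1j * S]], dtype=complex),
    "z": np.eye(2, dtype=complex),
}

def haar_random_state(n_qubits, rng):
    """Haar-random pure state: a normalized complex Gaussian vector."""
    d = 2 ** n_qubits
    psi = rng.normal(size=d) + 1j * rng.normal(size=d)
    return psi / np.linalg.norm(psi)

def pauli_basis_probabilities(psi, axes):
    """Outcome probabilities p_mu = <psi|E_mu|psi> for one Pauli basis setting."""
    U = np.array([[1.0 + 0j]])
    for a in axes:
        U = np.kron(U, PAULI_EIGENBASES[a])
    amplitudes = U.conj().T @ psi  # components of psi in the product eigenbasis
    return np.abs(amplitudes) ** 2

def simulate_record(psi, n_settings, n_rep, rng):
    """Frequencies of outcomes for n_settings random Pauli bases, n_rep shots each."""
    n = psi.size.bit_length() - 1  # number of qubits, since psi.size = 2**n
    settings, freqs = [], []
    for _ in range(n_settings):
        axes = rng.choice(list("xyz"), size=n)
        p = pauli_basis_probabilities(psi, axes)
        counts = rng.multinomial(n_rep, p / p.sum())
        settings.append("".join(axes))
        freqs.append(counts / n_rep)
    return settings, np.array(freqs)

def infidelity(psi0, rho_hat):
    """1 - <psi0|rho_hat|psi0> for a pure target state psi0."""
    return 1.0 - np.real(psi0.conj() @ rho_hat @ psi0)

rng = np.random.default_rng(0)
psi0 = haar_random_state(3, rng)
settings, f = simulate_record(psi0, n_settings=10, n_rep=200, rng=rng)
```

Each row of `f` is the frequency vector for one measurement basis and sums to one. An estimate `rho_hat` produced by any of the estimators would then be scored with `infidelity(psi0, rho_hat)`.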
SUPPLEMENTARY INFORMATION 

Section A: Proof of Theorem 1 

In a direct extension of Bruckstein et al. [26], we first show that under the appropriate conditions, positivity implies 
that the set $\{M \mid \mathcal{A}[M] = \mathbf{p},\ M \ge 0\}$ contains a single element. The proof does not assume that $M$ is a quantum 
state, nor that the sensing map is related to a POVM of any kind. It applies to the general circumstances of positive 
matrices and sensing maps. 

Consider the sensing map $\mathcal{A}[\,\cdot\,] = \mathrm{Tr}(E\,\cdot\,)$, where the elements of the vector $E$ are some matrices, $E_\mu$, $\mu = 
1, 2, \ldots, m$. Suppose that the span of $\{E_\mu\}$ is strictly positive, namely, 

$$\exists\, h = (h_1, \ldots, h_m)^T \quad \text{s.t.} \quad h^T E = \sum_{\mu=1}^{m} h_\mu E_\mu = W > 0,$$

with $W = BB^T$ a $d \times d$ (strictly) positive matrix. This allows us to perform a change of representation to an auxiliary 
problem. Defining $\mathcal{D}[\,\cdot\,] = \mathrm{Tr}\big(B^{-1} E\, (B^{T})^{-1}\,\cdot\,\big)$ and $Z = B^T M B$, there is a one-to-one correspondence between the solution 
sets 

$$\{M \mid \mathcal{A}[M] = \mathbf{p},\ M \ge 0\} \quad \text{and} \quad \{Z \mid \mathcal{D}[Z] = \mathbf{p},\ Z \ge 0\},$$

and the ranks of the solutions are the same. An important property of the modified problem is that 

$$\mathrm{Tr}(Z) = \mathrm{Tr}(WM) = h^T \mathbf{p} = c.$$

That is, the trace of $Z$ is fixed, and its value depends on $\mathbf{p}$ and the choice of $h$. Therefore, we can refine the above 
statement: there is a one-to-one correspondence between the solution sets 

$$\{M \mid \mathcal{A}[M] = \mathbf{p},\ M \ge 0\} \quad \text{and} \quad \{Z \mid \mathcal{D}[Z] = \mathbf{p},\ Z \ge 0,\ \mathrm{Tr}\,Z = c\},$$

and the ranks of the solutions are the same. 
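
The change of representation can be checked numerically. The sketch below is an illustration under simplifying assumptions: the matrices are real symmetric, the sensing matrices and the weights `h` are drawn at random with positive weights so that $W > 0$, and $B$ is taken to be the Cholesky factor of $W$. The function name is hypothetical. It verifies that the transformed map applied to $Z = B^T M B$ reproduces the same record and that $\mathrm{Tr}(Z) = \mathrm{Tr}(WM) = h^T \mathbf{p}$.

```python
import numpy as np

def random_psd(d, rng):
    """Random real symmetric positive-definite d x d matrix."""
    X = rng.normal(size=(d, d))
    return X @ X.T

def change_of_representation(E, h, M):
    """Return (p, p_from_Z, Z, W) for the maps A[M] = Tr(E M) and D[Z]."""
    W = np.einsum("u,uij->ij", h, E)           # W = sum_mu h_mu E_mu
    B = np.linalg.cholesky(W)                  # W = B B^T, requires W > 0
    B_inv = np.linalg.inv(B)
    p = np.einsum("uij,ji->u", E, M)           # p_mu = Tr(E_mu M)
    E_tilde = np.array([B_inv @ E_u @ B_inv.T for E_u in E])
    Z = B.T @ M @ B
    p_from_Z = np.einsum("uij,ji->u", E_tilde, Z)  # D[Z]_mu
    return p, p_from_Z, Z, W

rng = np.random.default_rng(0)
d, m = 4, 6
E = np.array([random_psd(d, rng) for _ in range(m)])
h = rng.uniform(0.5, 1.5, size=m)
M = random_psd(d, rng)
p, p_from_Z, Z, W = change_of_representation(E, h, M)
c = h @ p                                      # the fixed trace of Z
```

Since $B$ is invertible, the transformation $M \mapsto B^T M B$ preserves both positivity and rank, which is why the two solution sets are in one-to-one correspondence.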

Lemma 1. Assume $\mathbf{p} = \mathcal{D}[Z_0]$ for some $Z_0 \ge 0$ with $\mathrm{rank}(Z_0) \le r$. If $\mathcal{D}$ satisfies the restricted isometry property 
with constant $\delta_{4r} < \sqrt{2} - 1$, then the set $\{Z \mid \mathcal{D}[Z] = \mathbf{p},\ Z \ge 0,\ \mathrm{Tr}\,Z = c\}$ contains only one element, $Z = Z_0$. 

Proof. The Lemma assumes that $\mathrm{rank}(Z_0) \le r$ and $\delta_{4r} < \sqrt{2} - 1$. Therefore, according to Theorem 2.4 of [2 
(applied to the noiseless case), the problem 

$$\hat{Z} = \arg\min_{Z} \|Z\|_* \quad \text{s.t.} \quad \mathcal{D}[Z] = \mathbf{p}$$

has a unique minimizer $\hat{Z} = Z_0$. But since the feasible set of interest contains only positive matrices, $\|Z\|_* = \mathrm{Tr}\,Z$ on it. 
Therefore any other positive solution to $\mathcal{D}[Z] = \mathbf{p}$ must have a trace larger than $\mathrm{Tr}(Z_0) = c$, and thus it is necessarily not 
in the set $\{Z \mid \mathcal{D}[Z] = \mathbf{p},\ Z \ge 0,\ \mathrm{Tr}\,Z = c\}$. Hence, this set contains only one element, as claimed. □ 
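
The step that ties the nuclear norm to the trace holds because the singular values of a positive semidefinite matrix coincide with its eigenvalues. A small numerical check, with hypothetical names and randomly chosen test matrices, illustrates this and shows that the equality fails for a Hermitian matrix with a negative eigenvalue:

```python
import numpy as np

def nuclear_norm(Z):
    """Sum of the singular values of Z."""
    return np.linalg.svd(Z, compute_uv=False).sum()

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 2)) + 1j * rng.normal(size=(4, 2))
Z_psd = X @ X.conj().T                      # positive semidefinite, rank 2
Z_indef = np.diag([2.0, -1.0, 0.0, 0.0])    # Hermitian, but not positive

gap_psd = nuclear_norm(Z_psd) - np.trace(Z_psd).real   # zero up to rounding
gap_indef = nuclear_norm(Z_indef) - np.trace(Z_indef)  # (2 + 1) - (2 - 1) = 2
```

For the indefinite matrix the nuclear norm exceeds the trace, so without positivity the constraint $\mathrm{Tr}\,Z = c$ alone would not pin down the nuclear-norm minimizer.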

Since the set $\{Z \mid \mathcal{D}[Z] = \mathbf{p},\ Z \ge 0,\ \mathrm{Tr}\,Z = c\}$ contains only one element, so does the set $\{M \mid \mathcal{A}[M] = \mathbf{p},\ M \ge 0\}$, 
given that $\mathcal{D}$ satisfies the restricted isometry property with constant $\delta_{4r} < \sqrt{2} - 1$. In general, it is required to find a 
transformation of the sensing map $\mathcal{A}$ that yields $\mathcal{D}$ with $\delta_{4r} < 1$. 

This general result can be applied to the specific case of quantum tomography, where now $M = \rho$, a 
positive-semidefinite density matrix, and the elements of the vector $E$ form a (trace-preserving) POVM. In this case, we 
can choose $h = (1, 1, \ldots, 1)^T$, a vector whose elements are all 1; then $W = h^T E = \sum_\mu E_\mu = \mathbb{1}$, and thus $\mathcal{D} = \mathcal{A}$. 

Therefore, in this particular case, $\delta_{4r}(\mathcal{D}) = \delta_{4r}(\mathcal{A})$. 
Note that in order to show the generality of our result in CS, we have chosen to present arguments in the course 
of the proof that apply to general positive matrices and sensing maps, and only then to apply them specifically to the 
quantum tomography case. In the quantum case, however, this theorem follows directly, without the need for the 
construction of Bruckstein et al. For a trace-preserving POVM, it follows immediately that $\mathcal{D} = \mathcal{A}$ and $Z = \rho$. 
Therefore, for quantum tomography all of the arguments above that are made with relation to $\mathcal{D}$ and $Z$ can be made 
on $\mathcal{A}$ and $\rho$ directly, and Theorem 1 follows as an extension of [5], applied to positive matrices [M]. 


Section B: Bound on $\|\hat{\rho} - \rho_0\|_F$ 


Consider the following heuristic: 

$$\hat{M} = \arg\min_{M} \|M\|_* \quad \text{s.t.} \quad \|\mathcal{A}(M) - f\|_2 \le \epsilon. \tag{12}$$

Suppose that $M_0$ is an arbitrary-rank matrix. Let $M_0 = U\,\mathrm{diag}(\sigma)\,U^*$ be the singular value decomposition of $M_0$, 
where $\sigma$ is the list of ordered singular values $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_d$. We let $M_r$ be the part of $M_0$ corresponding to its 
largest $r$ singular values. By definition $M_c = M_0 - M_r$ corresponds to the $d - r$ smallest singular values of $M_0$, i.e., 
the ‘tail’ of $M_0$. 
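
The decomposition of a matrix into its rank-r part and its tail can be computed directly from a singular value decomposition. The sketch below uses a random positive test matrix and a hypothetical helper name. It also evaluates the nuclear norm of the tail, which equals the sum of the $d - r$ smallest singular values.

```python
import numpy as np

def split_head_tail(M0, r):
    """Return (M_r, M_c, s): best rank-r part, tail M0 - M_r, and singular values."""
    U, s, Vh = np.linalg.svd(M0)            # s is sorted in decreasing order
    M_r = (U[:, :r] * s[:r]) @ Vh[:r, :]    # keep the r largest singular values
    M_c = M0 - M_r
    return M_r, M_c, s

rng = np.random.default_rng(2)
X = rng.normal(size=(6, 6))
M0 = X @ X.T                                 # positive test matrix, full rank
M_r, M_c, s = split_head_tail(M0, r=2)
tail_nuclear_norm = np.linalg.svd(M_c, compute_uv=False).sum()
```

When $M_0$ has rank at most $r$, the tail vanishes and so does its nuclear norm.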

To bound $\|\hat{\rho} - \rho_0\|_F$ we use the following Lemma. 

Lemma 2. Suppose $\delta_{4r} < \sqrt{2} - 1$ and let $M_0$ be a matrix such that $\|\mathcal{A}(M_0) - f\|_2 \le \epsilon$. Then the solution $\hat{M}$ to (12) 
obeys 

$$\|\hat{M} - M_0\|_F \le C_0\,\epsilon + C_1 \frac{\|M_c\|_*}{\sqrt{r}}, \tag{13}$$

where $C_0$ and $C_1$ are constants depending only on the isometry constant $\delta_{4r}$. 


Lemma 2 is somewhat different from Lemma 3.2 proved in [5]. However, the proof of Lemma 3.2 applies directly 
to Lemma 2. 

An important special case of Lemma 2 is a signal matrix $M_0$ with $\mathrm{rank}(M_0) \le r$ that satisfies 
$\|\mathcal{A}(M_0) - f\|_2 \le \epsilon$. For this case, $M_c = 0$, and therefore 

$$\|\hat{M} - M_0\|_F \le C_0\,\epsilon. \tag{14}$$

We are now ready to bound $\|\hat{\rho} - \rho_0\|_F$. Using the triangle inequality and the result of equation (14) we get 

$$\|\hat{\rho} - \rho_0\|_F \le \|\hat{\rho} - \rho^*\|_F + \|\rho^* - \rho_0\|_F \le \|\hat{\rho} - \rho^*\|_F + C_0\,\epsilon, \tag{15}$$

where $\rho^*$ is the solution of (12), and $C_0$ is the constant of Lemma 2. 

To bound $\|\hat{\rho} - \rho^*\|_F$ we use the result of Lemma 2, which gives an upper bound on $\|\hat{\rho} - \rho^*\|_F$. The only assumption 
regarding $\hat{\rho}$ that entered the proof of Lemma 2 is that it is a feasible matrix, $\|\mathcal{A}(\hat{\rho}) - f\|_2 \le \epsilon$. However, $\hat{\rho}$ is a feasible 
matrix for the problem (12), since by its definition it minimizes $\|\mathcal{A}(\cdot) - f\|_2$. Therefore, necessarily, $\|\mathcal{A}(\hat{\rho}) - f\|_2 \le \epsilon$. 
Applying the result of Lemma 2 to bound $\|\hat{\rho} - \rho^*\|_F$, we can rewrite inequality (15) as 

$$\|\hat{\rho} - \rho_0\|_F \le 2C_0\,\epsilon + C_1 \frac{\|(\hat{\rho})_c\|_*}{\sqrt{r}},$$

where $C_1 = \dfrac{1 - (1 - \sqrt{2})\,\delta_{4r}}{1 - (1 + \sqrt{2})\,\delta_{4r}}$. 


Section C: Proof of formal equivalence between equation (8) and equation (9) of the main text 

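Consider the two minimization programs 

$$p_{\rm tr} = \min_{\rho}\ \mathrm{Tr}\,\rho \quad \text{subject to: } \|f - \mathcal{A}(\rho)\|_2 \le \epsilon \tag{16}$$

and 

$$p_{\rm ls} = \min_{\rho}\ \|f - \mathcal{A}(\rho)\|_2 \quad \text{subject to: } \mathrm{Tr}\,\rho = t, \tag{17}$$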

where, as before, $\mathcal{A}$ is a linear map, $\mathcal{A}: \mathbb{R}^{d \times d} \to \mathbb{R}^{m}$, and $f$ is the record $f = \mathcal{A}(\rho_0) + e$, where $\rho_0$ is the density 
matrix and $e$ denotes the noise. Similarly to Ref. [5], we take the map to be of the form $\mathcal{A}(\rho) = \mathrm{Tr}(E\rho)$, where 
$E = (E_1, E_2, \ldots, E_m)$, and $E_j$, $j = 1, 2, \ldots, m$, are $d \times d$ matrices representing the measurement operators. Inspired by 
the formulation of measurement, we further assume that $E_j \ge 0$ and $\sum_j E_j = \mathbb{1}$. 

Lemma 3. For a given map $\mathcal{A}$ and a record $f$, if $t = \sum_j f_j - \sqrt{m}\,\epsilon$, then the two convex programs (16) and (17) 
are mathematically equivalent. 


Proof. Since the objective functions and the constraints of the two convex programs are linear or quadratic, both 
programs have zero duality gap; thus strong duality holds for both. To prove the Lemma we construct and 
solve the dual problem of each (primal) program and then show that for $t = \sum_j f_j - \sqrt{m}\,\epsilon$ the solutions of the two 
corresponding dual problems coincide. Since there is no duality gap for these problems, this implies that the solutions 
to the two primal problems, first, equal the solutions of the dual problems and, second, coincide with each other, 
as claimed. 


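The (conic) Lagrangian of (16) is given by 

$$L(\rho, u, \lambda) = \mathrm{Tr}\,\rho + \sum_j u_j \big(f_j - \mathrm{Tr}(E_j \rho)\big) - \lambda\epsilon, \tag{18}$$

with the dual variables (Lagrange multipliers) $\|u\|_2 \le \lambda$ and $\lambda \ge 0$. The dual function is obtained by $\min_\rho L$, which 
is given by the condition $\nabla_\rho L = 0$. Using $\nabla_\rho \mathrm{Tr}(\rho A) = A$ we get 

$$\nabla_\rho L = -\sum_j u_j E_j + \mathbb{1} = 0 \;\Rightarrow\; \sum_{j=1}^{m} u_j E_j = \mathbb{1}, \tag{19}$$

and therefore 

$$\min_\rho L = \sum_j u_j f_j - \lambda\epsilon, \tag{20}$$

with $\|u\|_2 \le \lambda$ and $\lambda \ge 0$. The dual problem of (16) thus reads 

$$d_{\rm tr} = \max_{u,\lambda}\ \sum_j u_j f_j - \lambda\epsilon \quad \text{subject to: } \|u\|_2 \le \lambda,\ \lambda \ge 0. \tag{21}$$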


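In fact we can solve this program exactly: equation (19), $\sum_{j=1}^{m} u_j E_j = \mathbb{1}$, together with $\sum_{j=1}^{m} E_j = \mathbb{1}$, implies a 
solution $u_j = 1$ for $j = 1, \ldots, m$. Therefore the condition $\|u\|_2 \le \lambda$ now reads $\sqrt{m} \le \lambda$. Moreover, $\sum_j u_j f_j = \sum_j f_j$. 

Plugging all that in equation (21), we obtain 

$$d_{\rm tr} = \max_{\lambda}\ \sum_{j=1}^{m} f_j - \lambda\epsilon \quad \text{subject to: } \sqrt{m} \le \lambda. \tag{22}$$

The solution of this problem is given by taking the minimum value of $\lambda$, $\lambda = \sqrt{m}$, that is, $d_{\rm tr} = \sum_{j=1}^{m} f_j - \sqrt{m}\,\epsilon$. 
Since we have strong duality in this program we get that 

$$p_{\rm tr} = d_{\rm tr} = \sum_{j=1}^{m} f_j - \sqrt{m}\,\epsilon. \tag{23}$$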
Let $\rho_{\rm tr}$ be the argument that solves (16); then 

$$\mathrm{Tr}\,\rho_{\rm tr} = p_{\rm tr} = \sum_{j=1}^{m} f_j - \sqrt{m}\,\epsilon. \tag{24}$$

Next, we consider the problem of (17), which is equivalent to 

$$\min_{\rho} \max_{v}\ \langle v,\, f - \mathrm{Tr}(E\rho)\rangle \quad \text{subject to: } \mathrm{Tr}\,\rho = t,\ \|v\|_2 \le 1. \tag{25}$$

Thus, the (conic) Lagrangian function of this problem is given by 

$$L(\rho, v, \mu) = \sum_j v_j \big(f_j - \mathrm{Tr}(E_j \rho)\big) + \mu\,(\mathrm{Tr}\,\rho - t), \tag{26}$$


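with $\|v\|_2 \le 1$. The dual function is obtained by $\min_\rho L$, which is given by the condition $\nabla_\rho L = 0$. Using $\nabla_\rho \mathrm{Tr}(\rho A) = 
A$, we get 

$$\nabla_\rho L = -\sum_j v_j E_j + \mu \mathbb{1} = 0 \;\Rightarrow\; \sum_{j=1}^{m} v_j E_j = \mu \mathbb{1}, \tag{27}$$

and therefore 

$$\min_\rho L = \sum_j v_j f_j - \mu t, \tag{28}$$

with $\|v\|_2 \le 1$. The dual problem thus reads 

$$d_{\rm ls} = \max_{v,\mu}\ \sum_j v_j f_j - \mu t \quad \text{subject to: } \|v\|_2 \le 1. \tag{29}$$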


Similarly to the previous case, we can solve this program exactly: equation (27), $\sum_{j=1}^{m} v_j E_j = \mu \mathbb{1}$, together with 
$\sum_{j=1}^{m} E_j = \mathbb{1}$, implies a solution $v_j = \mu$ for $j = 1, \ldots, m$. Therefore, the condition $\|v\|_2 \le 1$ now reads $\sqrt{m}\,\mu \le 1$, that 
is, $\mu \le 1/\sqrt{m}$. Moreover, $\sum_j v_j f_j = \mu \sum_j f_j$. Plugging all that in equation (29), we obtain 

$$d_{\rm ls} = \max_{\mu}\ \mu \Big(\sum_{j=1}^{m} f_j - t\Big) \quad \text{subject to: } \mu \le 1/\sqrt{m}. \tag{30}$$

The solution to this problem is given by taking the maximum value of $\mu$, $\mu = 1/\sqrt{m}$, i.e., $d_{\rm ls} = \big(\sum_{j=1}^{m} f_j - t\big)/\sqrt{m}$. 
Since we have strong duality in this program we get that 

$$p_{\rm ls} = d_{\rm ls} = \frac{1}{\sqrt{m}} \Big(\sum_{j=1}^{m} f_j - t\Big). \tag{31}$$

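Let $\rho_{\rm ls}$ be the argument that solves (17); then $\mathrm{Tr}\,\rho_{\rm ls} = t$ and $\|f - \mathrm{Tr}(E\rho_{\rm ls})\|_2 = p_{\rm ls}$. 

The mathematical equivalence between the two programs, (16) and (17), is obtained by taking $t$ in equation (17) 
to be equal to $p_{\rm tr} = \mathrm{Tr}\,\rho_{\rm tr}$. For this value of $t$, $t = \mathrm{Tr}\,\rho_{\rm tr} = \sum_{j=1}^{m} f_j - \sqrt{m}\,\epsilon$, substituting into (31) gives 

$$d_{\rm ls} = \|f - \mathrm{Tr}(E\rho_{\rm ls})\|_2 = \epsilon. \tag{32}$$

The problem (16) finds a matrix $\rho_{\rm tr}$ which has the minimal trace and satisfies $\|f - \mathrm{Tr}(E\rho_{\rm tr})\|_2 \le \epsilon$. Using the 
value $t = \mathrm{Tr}\,\rho_{\rm tr}$ in (17) means that the program (17) finds the matrix $\rho_{\rm ls}$ which has the minimal $\|f - \mathrm{Tr}(E\rho_{\rm ls})\|_2$ 
and satisfies $\mathrm{Tr}\,\rho_{\rm ls} = \mathrm{Tr}\,\rho_{\rm tr}$. We showed that the solution is such that the minimal value is $\|f - \mathrm{Tr}(E\rho_{\rm ls})\|_2 = \epsilon$. This 
implies that every element in the set $\{\rho \mid \mathrm{Tr}\,\rho = \mathrm{Tr}\,\rho_{\rm tr}\}$ satisfies $\|f - \mathrm{Tr}(E\rho)\|_2 \ge \epsilon$. Therefore, we conclude that the 
solution of (16) necessarily satisfies $\|f - \mathrm{Tr}(E\rho_{\rm tr})\|_2 = \epsilon$. This in turn implies that both programs (16) and (17) return 
the same solution $\hat{\rho}$ with $\mathrm{Tr}\,\hat{\rho} = \sum_{j=1}^{m} f_j - \sqrt{m}\,\epsilon$ and $\|f - \mathrm{Tr}(E\hat{\rho})\|_2 = \epsilon$. □ 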
The programs (16) and (17) with $t = \mathrm{Tr}(\rho_{\rm tr})$ remain equivalent upon adding any convex constraint to them both. 
In particular, the two programs 

$$\min_{\rho}\ \mathrm{Tr}\,\rho \quad \text{subject to: } \|f - \mathcal{A}(\rho)\|_2 \le \epsilon,\ \rho \ge 0 \tag{33}$$

and 

$$\min_{\rho}\ \|f - \mathcal{A}(\rho)\|_2 \quad \text{subject to: } \mathrm{Tr}\,\rho = \mathrm{Tr}(\rho_{\rm tr}),\ \rho \ge 0 \tag{34}$$

are mathematically equivalent as claimed. 

Lastly, we remark that while the proof of equivalence was given here using the two-norm, $\|\cdot\|_2$, it holds for 
any norm. Therefore, the mathematical equivalence between the programs (33) and (34) also holds if we replace the 
two-norm that appears in these programs by any other norm. 





